Segmentations with Rearrangements

Authors

  • Aristides Gionis
  • Evimaria Terzi
Abstract

Sequence segmentation is a central problem in the analysis of sequential and time-series data. In this paper we introduce and study a novel variation of the segmentation problem: in addition to partitioning the sequence, we also seek to apply a limited amount of reordering so that the overall representation error is minimized. Our problem formulation has applications in segmenting data collected from a sensor network in which some of the sensors might be slightly out of sync, or in the analysis of newsfeed data in which news reports on a few different topics arrive in an interleaved manner. We formulate the problem of segmentation with rearrangements and show that it is NP-hard to solve or even to approximate. We then devise effective algorithms for the proposed problem, combining ideas from linear programming, dynamic programming, and outlier-detection algorithms for sequences. An extensive experimental evaluation on synthetic and real datasets demonstrates the efficacy of the suggested algorithms.
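To make the underlying problem concrete, the following is a minimal sketch of the classic dynamic program for plain k-segmentation (no rearrangements), which is the baseline that the abstract's variation builds on. It is not the authors' algorithm. The function name `segment`, the choice of sum-of-squared-errors around each segment mean as the representation error, and the toy inputs are all assumptions made for illustration.

```python
from typing import List, Tuple


def segment(seq: List[float], k: int) -> Tuple[float, List[int]]:
    """Optimal k-segmentation of seq under sum-of-squared-errors.

    Hypothetical helper for illustration. Returns (total_error, starts),
    where starts holds the start index of each of the k segments.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(seq)")

    # Prefix sums of values and of squared values give O(1) segment error.
    ps = [0.0] * (n + 1)
    ps2 = [0.0] * (n + 1)
    for i, x in enumerate(seq):
        ps[i + 1] = ps[i] + x
        ps2[i + 1] = ps2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of representing seq[i:j] by its mean."""
        s = ps[j] - ps[i]
        return (ps2[j] - ps2[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[m][j]: best error for covering seq[:j] with exactly m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover the segment start indices.
    starts: List[int] = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return cost[k][n], starts


if __name__ == "__main__":
    # A clean two-level sequence versus one with a single misplaced point.
    print(segment([1, 1, 1, 5, 5, 5], 2))
    print(segment([1, 1, 5, 1, 5, 5], 2))
```

In the second call one point is out of place, so no two-segment partition fits the data exactly. Moving that single point back would restore a perfect fit, which is the kind of limited reordering the abstract describes; choosing which points to move is the part the paper shows to be hard.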

Similar resources

Problems and Algorithms for Sequence Segmentations

The analysis of sequential data is required in many diverse areas such as telecommunications, stock market analysis, and bioinformatics. A basic problem related to the analysis of sequential data is the sequence segmentation problem. A sequence segmentation is a partition of the sequence into a number of non-overlapping segments that cover all data points, such that each segment is as homogeneo...

Diversity of T-cell receptor Gene Rearrangements in South Indian Patients with Common Acute Lymphoblastic Leukemia

Background: Precursor B-Acute Lymphoblastic Leukemia (precursor B-ALL) occurs due to the uncontrolled proliferation of B-lymphoid precursors arrested at a particular stage of B-cell development. Precursor B-ALL is classified mainly into pro-B-ALL, common-ALL and pre-B-ALL. The Common Acute Lymphoblastic Antigen CD10 is the marker for common-ALL. Objective: This study was aimed to examine the ...

Subtelomeric Rearrangements in Patients with Recurrent Miscarriage

Objective: Subtelomeric rearrangements are increasingly being investigated in cases of idiopathic intellectual disabilities (ID) and congenital abnormalities (CA) but have also been suspected to be responsible for unexplained recurrent miscarriage (RM). We have noticed a higher risk of subtelomeric translocations in association with CA and ID. Such rearrangements can go unnoticed through con...

Plant Classification in Images of Natural Scenes Using Segmentations Fusion

This paper presents a novel approach to automatically classifying and identifying tree leaves using image segmentation fusion. With the development of mobile devices and remote access, automatic plant identification in images taken in natural scenes has received much attention. Image segmentation plays a key role in most plant identification methods, especially in complex background images. Wher...

A Review of Driver Genetic Alterations in Thyroid Cancers

Thyroid cancer is a frequent endocrine-related malignancy with continuously increasing incidence. There has been notable progress recently in understanding its molecular pathogenesis, mainly through the elucidation of the role of several key signaling pathways and related molecular distributors. Central to these mechanisms are the genetic and epigenetic alterations in these pathways, su...

Publication date: 2007